Factors Affecting the Accuracy of Korean Parsing
Abstract
We investigate parsing accuracy on the Korean Treebank 2.0 with a number of different grammars. Comparisons among these grammars and with their English counterparts point to different aspects of Korean that contribute to parsing difficulty. Our results indicate that the coarseness of the Treebank’s nonterminal set is an even greater problem than in the English Treebank. We also find that Korean’s relatively free word order does not affect parsing results as much as one might expect; rather, the prevalence of zero pronouns accounts for a large portion of the difference between Korean and English parsing scores.
Similar resources
The Effect of Morphology on Dependency Parsing of Persian
Data-driven systems can be adapted to different languages and domains easily. Applying this trend to dependency parsing has led to the introduction of data-driven approaches. The existence of appropriate corpora containing sentences and their associated dependency trees is the only prerequisite for data-driven approaches. Despite obtaining highly accurate results for the dependency parsing task in English langu...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
Probabilistic Parsing of Korean Sentences Using Collocational Information
Lexical information is one of the most important sources that can improve the accuracy of syntactic disambiguation. This paper describes a Korean probabilistic parser that is based on the probabilities of phrase structure rules as well as the probabilities of collocational information between lexical items to resolve syntactic ambiguity. The proposed parser is shown by means of an extensive...
A Comparative Study of the Effect of Part-of-Speech Tagging on Parsing in Automatic Processing of the Persian Language
In this paper, the role of Part-of-Speech (POS) tagging for parsing in automatic processing of the Persian language is studied. To this end, the impact of the quality of POS tagging as well as the impact of the quantity of information available in the POS tags on parsing are studied. To reach the goals, three parsing scenarios are proposed and compared. In the first scenario, the parser assigns...
Probabilistic Language Model for Analyzing Korean Sentences
In this paper, we introduce a restricted form of phrase structure grammar to handle the characteristics of Korean more efficiently. Based on this restricted form of the grammar, we propose a probabilistic parser for Korean sentences. To show the usefulness of the parser proposed in this paper, we made a preliminary experiment. We extract a set of rules from about 1,682 tree-annotated sentences. The e...
Publication date: 2010